Assignment 3
Report for the third assignment of the Effective MLOps: Model Development course.
Data Partitioning Validation
- In the figure below, one can see that the distribution of the target variable ("rating") is essentially the same in the train and validation sets (the first sketch after this list shows how such a check can be scripted)
[Figure: histograms of the "rating" distribution in the train and validation sets]
- Concerning data leakage, two controversial features are "book_id" and "user_id". Depending on the goal of the project, these might need to be removed entirely. However, given that the objective is to reach the best F1 score in the competition, these features were kept. Not surprisingly, the two reached the highest importance among all features, at a significant distance from the third most important one (see the second sketch after this list)
- Additionally, several other features were recorded after the rating itself, namely "n_votes" and "n_comments", which may also be problematic depending on the use case of the final model. Once again, the goal of the project was to reach the best possible score, so these fields were used in training
- As the problem is not framed as a time-series task, the time features were not considered
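To make the split check reproducible, the normalized rating counts of the two partitions can be compared directly. This is a minimal sketch: the column name "rating" comes from this report, while the file paths and variable names are placeholders for however the partitions are stored in the actual project.

```python
# Minimal sketch of the split check; "train.csv" / "val.csv" are placeholder
# paths standing in for the real partition files.
import pandas as pd

def rating_distribution(df: pd.DataFrame) -> pd.Series:
    """Share of each rating value, sorted by rating."""
    return df["rating"].value_counts(normalize=True).sort_index()

train_df = pd.read_csv("train.csv")
val_df = pd.read_csv("val.csv")

# The two columns should be close to each other for a sound partition,
# mirroring the histograms shown above.
print(pd.concat(
    [rating_distribution(train_df), rating_distribution(val_df)],
    axis=1, keys=["train", "val"],
))
```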
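The importance claim can be verified on any fitted tree-based model. The sketch below uses a scikit-learn RandomForestClassifier purely as a stand-in; the actual competition model, preprocessing, and file path may differ.

```python
# Hypothetical sketch of the importance check; assumes numeric features and
# reuses the placeholder path "train.csv" from the previous sketch.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

train_df = pd.read_csv("train.csv")
X_train = train_df.drop(columns=["rating"])
y_train = train_df["rating"]

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Rank features by impurity-based importance; in this project "book_id" and
# "user_id" came out on top, well ahead of the third feature.
importances = pd.Series(model.feature_importances_, index=X_train.columns)
print(importances.sort_values(ascending=False).head(10))
```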
Evaluation Metric
- The evaluation metric used by the competition is the F1 score; hence, it was the primary metric used in this project (a minimal scoring sketch follows this list)
- No threshold metrics were provided by the challenge organizers
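For reference, the validation score can be computed with scikit-learn as below. The averaging mode is an assumption here (macro is shown; the competition rules may specify a different one), and the labels are toy values.

```python
# Toy example of multi-class F1 scoring; "macro" averaging is an assumption.
from sklearn.metrics import f1_score

y_true = [3, 4, 5, 2, 4, 3]  # illustrative ratings, not real data
y_pred = [3, 4, 4, 2, 5, 3]

print(f"F1 (macro): {f1_score(y_true, y_pred, average='macro'):.3f}")
```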
Model Registry
- Created a model registry
- Linked the model version from the best run to it (a hedged sketch of this step follows)
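A sketch of the linking step with the W&B SDK is shown below. The project name, artifact alias, and registered-model path are placeholders, not the real identifiers from the course project.

```python
# Sketch of linking the best model version to the W&B Model Registry.
# "mlops-course-assignment-3", "model:best" and "rating-classifier" are
# placeholder names.
import wandb

run = wandb.init(project="mlops-course-assignment-3", job_type="registry")

best_model = run.use_artifact("model:best")  # best run's logged model version
run.link_artifact(best_model, "model-registry/rating-classifier")

run.finish()
```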
Error Analysis
- The F1 score on the validation set is 0.40, while the score on the test data is 0.38, indicating no significant overfitting to the validation set
- The confusion matrix below shows that the model does not make large mistakes: the majority of predictions are close to the true ratings, as expressed by the lighter shades on the main diagonal. When the model fails to predict the rating exactly, it typically misses by 1-2 points (see the sketch after this list for how the matrix can be produced and logged)
- In this case, it is not possible to inspect the inputs and confirm whether the true labels are indeed correct, as one could in an image segmentation problem, for instance
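The matrix can be computed with scikit-learn and logged back to W&B so it renders in the report. The labels below are toy values standing in for the real validation predictions, and the project name and class range (ratings 0-5) are assumptions.

```python
# Toy sketch: compute a confusion matrix and log it to the run page.
import wandb
from sklearn.metrics import confusion_matrix

y_true = [3, 4, 5, 2, 4, 3, 5, 1]  # illustrative ratings, not real data
y_pred = [3, 4, 4, 2, 5, 3, 5, 2]

print(confusion_matrix(y_true, y_pred))  # mass on the diagonal = mostly correct

run = wandb.init(project="mlops-course-assignment-3", job_type="error-analysis")
run.log({"confusion_matrix": wandb.plot.confusion_matrix(
    y_true=y_true, preds=y_pred,
    class_names=[str(r) for r in range(6)],
)})
run.finish()
```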